fix(castore): introduce fast path while getting torrent metadata to reduce allocs#611
fix(castore): introduce fast path while getting torrent metadata to reduce allocs#611sambhav-jain-16 wants to merge 5 commits intouber:masterfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Improves CAStore.GetCacheFileMetadata performance when serving torrent metadata from the in-memory cache by avoiding per-read serialization/deserialization, and adds coverage/benchmarks to validate allocation behavior.
Changes:
- Add a fast path in
CAStore.GetCacheFileMetadatato return the cached*core.MetaInfopointer directly for*metadata.TorrentMeta. - Add a unit test asserting the no-copy behavior when reading metadata from the memory cache.
- Add a benchmark suite to measure allocations/time across different blob sizes and piece counts for memory-cache metadata reads.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
lib/store/ca_store.go |
Returns cached *core.MetaInfo directly for torrent metadata reads from memory cache, avoiding JSON round-trips. |
lib/store/ca_store_test.go |
Adds a no-copy unit test and a benchmark measuring metadata read allocations across scenarios. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
b1e3e09 to
42420f8
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Do we want to remove the serialisation path until other metadata types are added and return an error instead? In the current implementation, this path would yield a silent invariant violation, as torrent metadata is currently the only supported type.
There was a problem hiding this comment.
I want to be as defensive as possible with this change to avoid any regressions. Since this is the current production behavior, let's keep it this way, if you feel strongly against it, we can discuss more.
There was a problem hiding this comment.
This function is also called for metadata.Persist here
kraken/origin/blobserver/server.go
Lines 1026 to 1029 in f862dd5
It's better we keep it this way to be cleaner
There was a problem hiding this comment.
Since this PR aims to remove the need for serialisation by returning a metadata pointer instead, would it not make more sense to either completely remove it or at least log an error as we expect that path to never be reached?
When we call it using metadata.Persist, the metadata is expected to be fetched from disk. Could you elaborate on the relevance?
There was a problem hiding this comment.
@sambhav-jain-16 Do I understand correctly that ALWAYS expect the if check to be true? If so, can we emit an error log in the else case, so we catch a potential invariant violation if it happens?
There was a problem hiding this comment.
@thijmv @Anton-Kalpakchiev
I took another look, and I agree with your suggestions.
When we call it using metadata.Persist, the metadata is expected to be fetched from disk. Could you elaborate on the relevance?
I agree on this, given that as of now we are not using the in-mem cache unto upload path, this case shouldn't happen.
I'll make the changes and update
There was a problem hiding this comment.
@Anton-Kalpakchiev @thijmv
Made the changes. PTAL
6aaaf7d to
33637e0
Compare
33637e0 to
d4a51c1
Compare
| // hand back cached pointer | ||
| tm.MetaInfo = entry.MetaInfo | ||
| return nil |
| // Verify if the asked metadata is TorrentMetdata or not | ||
| tm, ok := md.(*metadata.TorrentMeta) | ||
| if !ok { | ||
| return fmt.Errorf("unvariant violation: GetCacheFileMetadata should only be called for TorrentMetadata") |
What?
TorrentMeta.Serialize/Deserializeround-trip svia JSON (json.Marshal + json.Unmarshal) for a MetaInfo. For every blob this allocates a fresh[]bytebuffer and a new*core.MetaInfoon every call toGetCacheFileMetadata. The fast path short-circuits it when the entry is already in the memory cache it hands back the existing*core.MetaInfopointer directly, so a metadata read becomes a single pointer assignment with zero heap allocations.Why only TorrentMetadata ?
TorrentMetais the only metadata type whose backing value (*core.MetaInfo) is immutable after creation. Other metadata types (e.g.LastAccessTime) are mutable values written on every access.Testing
Added a benchmark
BenchmarkGetCacheFileMetadata_MemoryCachewith different blob sizes and metadata sizesFollowing is the
benchstatcomparison.